Overview

Dataset statistics

Number of variables: 14
Number of observations: 4872
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 533.0 KiB
Average record size in memory: 112.0 B

Variable types

NUM: 12
CAT: 2

Warnings

citric_acid has 117 (2.4%) zeros

Reproduction

Analysis started: 2020-11-21 19:31:20.736055
Analysis finished: 2020-11-21 19:31:40.476488
Duration: 19.74 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

Distinct: 3974
Distinct (%): 81.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2041.144089
Minimum: 0
Maximum: 4897
Zeros: 2
Zeros (%): < 0.1%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 152.1
Q1: 807
Median: 1652.5
Q3: 3283.25
95-th percentile: 4575.45
Maximum: 4897
Range: 4897
Interquartile range (IQR): 2476.25

Descriptive statistics

Standard deviation: 1441.655691
Coefficient of variation (CV): 0.7062978546
Kurtosis: -1.117858499
Mean: 2041.144089
Median Absolute Deviation (MAD): 1101.5
Skewness: 0.4119395107
Sum: 9944454
Variance: 2078371.131
Monotonicity: Not monotonic
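
Several of the statistics listed for df_index are simple functions of one another. As a rough cross-check, the sketch below recomputes the range, the IQR, the variance and the coefficient of variation from the figures reported in this section. The variable names are illustrative only and are not part of the report.

```python
# Figures copied from the df_index statistics reported above.
minimum, maximum = 0, 4897
q1, q3 = 807, 3283.25
mean = 2041.144089
std = 1441.655691

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
variance = std ** 2              # compare with the reported Variance
cv = std / mean                  # compare with the reported CV

print(value_range, iqr, round(variance, 3), round(cv, 6))
```

Small differences in the last digits are expected, because the inputs are already rounded in the report.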
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
6 | 2 | < 0.1%
262 | 2 | < 0.1%
286 | 2 | < 0.1%
294 | 2 | < 0.1%
302 | 2 | < 0.1%
306 | 2 | < 0.1%
310 | 2 | < 0.1%
318 | 2 | < 0.1%
330 | 2 | < 0.1%
338 | 2 | < 0.1%
Other values (3964) | 4852 | 99.6%

Value | Count | Frequency (%)
0 | 2 | < 0.1%
1 | 2 | < 0.1%
2 | 1 | < 0.1%
3 | 2 | < 0.1%
4 | 1 | < 0.1%

Value | Count | Frequency (%)
4897 | 1 | < 0.1%
4896 | 1 | < 0.1%
4895 | 1 | < 0.1%
4894 | 1 | < 0.1%
4893 | 1 | < 0.1%

fixed_acidity
Real number (ℝ≥0)

Distinct: 105
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.234308292
Minimum: 3.8
Maximum: 15.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 3.8
5-th percentile: 5.7
Q1: 6.4
Median: 7
Q3: 7.7
95-th percentile: 9.9
Maximum: 15.9
Range: 12.1
Interquartile range (IQR): 1.3

Descriptive statistics

Standard deviation: 1.324478017
Coefficient of variation (CV): 0.1830828828
Kurtosis: 5.01084937
Mean: 7.234308292
Median Absolute Deviation (MAD): 0.6
Skewness: 1.744912248
Sum: 35245.55
Variance: 1.754242018
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
6.8 | 268 | 5.5%
6.6 | 241 | 4.9%
6.4 | 227 | 4.7%
6.9 | 216 | 4.4%
7.2 | 205 | 4.2%
6.7 | 193 | 4.0%
7.1 | 192 | 3.9%
7 | 188 | 3.9%
6.5 | 179 | 3.7%
7.3 | 176 | 3.6%
Other values (95) | 2787 | 57.2%

Value | Count | Frequency (%)
3.8 | 1 | < 0.1%
4.2 | 2 | < 0.1%
4.4 | 2 | < 0.1%
4.5 | 1 | < 0.1%
4.6 | 2 | < 0.1%

Value | Count | Frequency (%)
15.9 | 1 | < 0.1%
15.6 | 1 | < 0.1%
15.5 | 2 | < 0.1%
15 | 1 | < 0.1%
14.3 | 1 | < 0.1%

volatile_acidity
Real number (ℝ≥0)

Distinct: 185
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3435724548
Minimum: 0.08
Maximum: 1.58
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0.08
5-th percentile: 0.16
Q1: 0.23
Median: 0.3
Q3: 0.41
95-th percentile: 0.68
Maximum: 1.58
Range: 1.5
Interquartile range (IQR): 0.18

Descriptive statistics

Standard deviation: 0.1669408145
Coefficient of variation (CV): 0.4858969693
Kurtosis: 3.024510996
Mean: 0.3435724548
Median Absolute Deviation (MAD): 0.08
Skewness: 1.531190873
Sum: 1673.885
Variance: 0.02786923556
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.24 | 203 | 4.2%
0.28 | 198 | 4.1%
0.26 | 198 | 4.1%
0.25 | 190 | 3.9%
0.22 | 179 | 3.7%
0.23 | 167 | 3.4%
0.27 | 162 | 3.3%
0.3 | 162 | 3.3%
0.2 | 160 | 3.3%
0.32 | 153 | 3.1%
Other values (175) | 3100 | 63.6%

Value | Count | Frequency (%)
0.08 | 1 | < 0.1%
0.085 | 1 | < 0.1%
0.09 | 1 | < 0.1%
0.1 | 4 | 0.1%
0.105 | 3 | 0.1%

Value | Count | Frequency (%)
1.58 | 1 | < 0.1%
1.33 | 2 | < 0.1%
1.24 | 1 | < 0.1%
1.18 | 1 | < 0.1%
1.13 | 1 | < 0.1%

citric_acid
Real number (ℝ≥0)

ZEROS

Distinct: 89
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3189634647
Minimum: 0
Maximum: 1.66
Zeros: 117
Zeros (%): 2.4%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.05
Q1: 0.25
Median: 0.31
Q3: 0.39
95-th percentile: 0.5645
Maximum: 1.66
Range: 1.66
Interquartile range (IQR): 0.14

Descriptive statistics

Standard deviation: 0.1459710514
Coefficient of variation (CV): 0.4576419167
Kurtosis: 2.720463079
Mean: 0.3189634647
Median Absolute Deviation (MAD): 0.07
Skewness: 0.5174778139
Sum: 1553.99
Variance: 0.02130754783
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.3 | 257 | 5.3%
0.28 | 231 | 4.7%
0.26 | 202 | 4.1%
0.32 | 202 | 4.1%
0.49 | 198 | 4.1%
0.34 | 188 | 3.9%
0.29 | 184 | 3.8%
0.31 | 173 | 3.6%
0.27 | 168 | 3.4%
0.24 | 167 | 3.4%
Other values (79) | 2902 | 59.6%

Value | Count | Frequency (%)
0 | 117 | 2.4%
0.01 | 26 | 0.5%
0.02 | 39 | 0.8%
0.03 | 24 | 0.5%
0.04 | 31 | 0.6%

Value | Count | Frequency (%)
1.66 | 1 | < 0.1%
1.23 | 1 | < 0.1%
1 | 4 | 0.1%
0.99 | 1 | < 0.1%
0.91 | 2 | < 0.1%

residual_sugar
Real number (ℝ≥0)

Distinct: 299
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.442692939
Minimum: 0.6
Maximum: 65.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0.6
5-th percentile: 1.2
Q1: 1.8
Median: 3.05
Q3: 8.1
95-th percentile: 15
Maximum: 65.8
Range: 65.2
Interquartile range (IQR): 6.3

Descriptive statistics

Standard deviation: 4.747205898
Coefficient of variation (CV): 0.8722163736
Kurtosis: 5.758810471
Mean: 5.442692939
Median Absolute Deviation (MAD): 1.75
Skewness: 1.534818269
Sum: 26516.8
Variance: 22.53596384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2 | 172 | 3.5%
1.4 | 164 | 3.4%
1.8 | 158 | 3.2%
1.6 | 158 | 3.2%
1.2 | 151 | 3.1%
2.1 | 141 | 2.9%
2.2 | 138 | 2.8%
1.9 | 134 | 2.8%
1.5 | 131 | 2.7%
2.3 | 127 | 2.6%
Other values (289) | 3398 | 69.7%

Value | Count | Frequency (%)
0.6 | 2 | < 0.1%
0.7 | 6 | 0.1%
0.8 | 17 | 0.3%
0.9 | 28 | 0.6%
0.95 | 2 | < 0.1%

Value | Count | Frequency (%)
65.8 | 1 | < 0.1%
31.6 | 2 | < 0.1%
26.05 | 1 | < 0.1%
23.5 | 1 | < 0.1%
22.6 | 1 | < 0.1%

chlorides
Real number (ℝ≥0)

Distinct: 197
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0561430624
Minimum: 0.009
Maximum: 0.611
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0.009
5-th percentile: 0.028
Q1: 0.038
Median: 0.047
Q3: 0.066
95-th percentile: 0.1
Maximum: 0.611
Range: 0.602
Interquartile range (IQR): 0.028

Descriptive statistics

Standard deviation: 0.03523188657
Coefficient of variation (CV): 0.6275376702
Kurtosis: 53.05130162
Mean: 0.0561430624
Median Absolute Deviation (MAD): 0.011
Skewness: 5.487260087
Sum: 273.529
Variance: 0.001241285832
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.036 | 162 | 3.3%
0.044 | 147 | 3.0%
0.042 | 142 | 2.9%
0.046 | 141 | 2.9%
0.047 | 136 | 2.8%
0.05 | 134 | 2.8%
0.048 | 134 | 2.8%
0.04 | 130 | 2.7%
0.034 | 126 | 2.6%
0.045 | 126 | 2.6%
Other values (187) | 3494 | 71.7%

Value | Count | Frequency (%)
0.009 | 1 | < 0.1%
0.012 | 3 | 0.1%
0.013 | 1 | < 0.1%
0.014 | 4 | 0.1%
0.015 | 3 | 0.1%

Value | Count | Frequency (%)
0.611 | 1 | < 0.1%
0.61 | 1 | < 0.1%
0.464 | 1 | < 0.1%
0.422 | 1 | < 0.1%
0.415 | 1 | < 0.1%

free_sulfur_dioxide
Real number (ℝ≥0)

Distinct: 132
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.61514778
Minimum: 1
Maximum: 289
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 17
Median: 29
Q3: 41
95-th percentile: 61
Maximum: 289
Range: 288
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 17.96642208
Coefficient of variation (CV): 0.5868474719
Kurtosis: 9.92286932
Mean: 30.61514778
Median Absolute Deviation (MAD): 12
Skewness: 1.38036536
Sum: 149157
Variance: 322.7923223
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
29 | 137 | 2.8%
6 | 130 | 2.7%
26 | 120 | 2.5%
15 | 117 | 2.4%
31 | 116 | 2.4%
17 | 115 | 2.4%
34 | 115 | 2.4%
24 | 114 | 2.3%
35 | 113 | 2.3%
21 | 108 | 2.2%
Other values (122) | 3687 | 75.7%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 48 | 1.0%
4 | 41 | 0.8%
5 | 103 | 2.1%

Value | Count | Frequency (%)
289 | 1 | < 0.1%
146.5 | 1 | < 0.1%
138.5 | 1 | < 0.1%
131 | 1 | < 0.1%
128 | 1 | < 0.1%

total_sulfur_dioxide
Real number (ℝ≥0)

Distinct: 272
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 116.0587028
Minimum: 6
Maximum: 440
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 6
5-th percentile: 19
Q1: 78
Median: 118
Q3: 156
95-th percentile: 206
Maximum: 440
Range: 434
Interquartile range (IQR): 78

Descriptive statistics

Standard deviation: 56.9460764
Coefficient of variation (CV): 0.4906661459
Kurtosis: -0.3094440636
Mean: 116.0587028
Median Absolute Deviation (MAD): 39
Skewness: 0.01981624191
Sum: 565438
Variance: 3242.855617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
111 | 54 | 1.1%
113 | 50 | 1.0%
118 | 44 | 0.9%
110 | 44 | 0.9%
128 | 43 | 0.9%
98 | 43 | 0.9%
132 | 42 | 0.9%
122 | 41 | 0.8%
133 | 40 | 0.8%
131 | 40 | 0.8%
Other values (262) | 4431 | 90.9%

Value | Count | Frequency (%)
6 | 2 | < 0.1%
7 | 2 | < 0.1%
8 | 10 | 0.2%
9 | 14 | 0.3%
10 | 24 | 0.5%

Value | Count | Frequency (%)
440 | 1 | < 0.1%
366.5 | 1 | < 0.1%
344 | 1 | < 0.1%
313 | 1 | < 0.1%
307.5 | 1 | < 0.1%

density
Real number (ℝ≥0)

Distinct: 935
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9947374312
Minimum: 0.98711
Maximum: 1.03898
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0.98711
5-th percentile: 0.9899255
Q1: 0.9924
Median: 0.99497
Q3: 0.997
95-th percentile: 0.9994
Maximum: 1.03898
Range: 0.05187
Interquartile range (IQR): 0.0046

Descriptive statistics

Standard deviation: 0.003031426585
Coefficient of variation (CV): 0.003047464074
Kurtosis: 8.614202103
Mean: 0.9947374312
Median Absolute Deviation (MAD): 0.00227
Skewness: 0.6344798903
Sum: 4846.360765
Variance: 9.189547142e-06
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.9976 | 54 | 1.1%
0.9962 | 48 | 1.0%
0.998 | 48 | 1.0%
0.992 | 47 | 1.0%
0.9958 | 47 | 1.0%
0.9972 | 47 | 1.0%
0.9966 | 46 | 0.9%
0.9986 | 46 | 0.9%
0.9968 | 46 | 0.9%
0.9928 | 45 | 0.9%
Other values (925) | 4398 | 90.3%

Value | Count | Frequency (%)
0.98711 | 1 | < 0.1%
0.98713 | 1 | < 0.1%
0.9874 | 1 | < 0.1%
0.98742 | 2 | < 0.1%
0.98746 | 1 | < 0.1%

Value | Count | Frequency (%)
1.03898 | 1 | < 0.1%
1.0103 | 2 | < 0.1%
1.00369 | 2 | < 0.1%
1.0032 | 1 | < 0.1%
1.00315 | 2 | < 0.1%

pH
Real number (ℝ≥0)

Distinct: 106
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.217754516
Minimum: 2.72
Maximum: 4.01
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 2.72
5-th percentile: 2.97
Q1: 3.11
Median: 3.2
Q3: 3.32
95-th percentile: 3.5
Maximum: 4.01
Range: 1.29
Interquartile range (IQR): 0.21

Descriptive statistics

Standard deviation: 0.1604890355
Coefficient of variation (CV): 0.04987609675
Kurtosis: 0.3759502829
Mean: 3.217754516
Median Absolute Deviation (MAD): 0.1
Skewness: 0.3838651215
Sum: 15676.9
Variance: 0.02575673053
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
3.16 | 157 | 3.2%
3.14 | 144 | 3.0%
3.18 | 136 | 2.8%
3.19 | 132 | 2.7%
3.15 | 131 | 2.7%
3.2 | 131 | 2.7%
3.22 | 130 | 2.7%
3.1 | 120 | 2.5%
3.26 | 117 | 2.4%
3.17 | 116 | 2.4%
Other values (96) | 3558 | 73.0%

Value | Count | Frequency (%)
2.72 | 1 | < 0.1%
2.74 | 2 | < 0.1%
2.77 | 1 | < 0.1%
2.79 | 3 | 0.1%
2.8 | 3 | 0.1%

Value | Count | Frequency (%)
4.01 | 1 | < 0.1%
3.9 | 2 | < 0.1%
3.85 | 1 | < 0.1%
3.81 | 1 | < 0.1%
3.8 | 1 | < 0.1%

sulphates
Real number (ℝ≥0)

Distinct: 107
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5324753695
Minimum: 0.22
Maximum: 2
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 0.22
5-th percentile: 0.35
Q1: 0.43
Median: 0.51
Q3: 0.6
95-th percentile: 0.79
Maximum: 2
Range: 1.78
Interquartile range (IQR): 0.17

Descriptive statistics

Standard deviation: 0.1484505251
Coefficient of variation (CV): 0.2787932244
Kurtosis: 8.668378876
Mean: 0.5324753695
Median Absolute Deviation (MAD): 0.08
Skewness: 1.790216574
Sum: 2594.22
Variance: 0.02203755842
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.5 | 199 | 4.1%
0.54 | 181 | 3.7%
0.46 | 180 | 3.7%
0.44 | 169 | 3.5%
0.52 | 158 | 3.2%
0.38 | 156 | 3.2%
0.48 | 152 | 3.1%
0.47 | 149 | 3.1%
0.49 | 148 | 3.0%
0.53 | 146 | 3.0%
Other values (97) | 3234 | 66.4%

Value | Count | Frequency (%)
0.22 | 1 | < 0.1%
0.23 | 1 | < 0.1%
0.25 | 3 | 0.1%
0.26 | 4 | 0.1%
0.27 | 9 | 0.2%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
1.95 | 2 | < 0.1%
1.61 | 1 | < 0.1%
1.59 | 1 | < 0.1%
1.56 | 1 | < 0.1%

alcohol
Real number (ℝ≥0)

Distinct: 103
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.48427477
Minimum: 8
Maximum: 14.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 38.1 KiB

Quantile statistics

Minimum: 8
5-th percentile: 8.9
Q1: 9.5
Median: 10.3
Q3: 11.3
95-th percentile: 12.7
Maximum: 14.9
Range: 6.9
Interquartile range (IQR): 1.8

Descriptive statistics

Standard deviation: 1.196296361
Coefficient of variation (CV): 0.1141038734
Kurtosis: -0.5171509477
Mean: 10.48427477
Median Absolute Deviation (MAD): 0.9
Skewness: 0.5770674868
Sum: 51079.38667
Variance: 1.431124983
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
9.5 | 279 | 5.7%
9.4 | 260 | 5.3%
9.2 | 197 | 4.0%
10 | 178 | 3.7%
10.5 | 175 | 3.6%
9.8 | 162 | 3.3%
9 | 157 | 3.2%
11 | 157 | 3.2%
10.4 | 145 | 3.0%
9.6 | 144 | 3.0%
Other values (93) | 3018 | 61.9%

Value | Count | Frequency (%)
8 | 2 | < 0.1%
8.4 | 5 | 0.1%
8.5 | 8 | 0.2%
8.6 | 21 | 0.4%
8.7 | 59 | 1.2%

Value | Count | Frequency (%)
14.9 | 1 | < 0.1%
14.2 | 1 | < 0.1%
14.05 | 1 | < 0.1%
14 | 9 | 0.2%
13.9 | 2 | < 0.1%

quality_level
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.1 KiB

Value | Count | Frequency (%)
Good | 3729 | 76.5%
Excellent | 951 | 19.5%
Bad | 192 | 3.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 9
Median length: 4
Mean length: 4.936576355
Min length: 3

wine_type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.1 KiB

Value | Count | Frequency (%)
white | 3656 | 75.0%
red | 1216 | 25.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 5
Median length: 5
Mean length: 4.500821018
Min length: 3

Interactions

[144 interaction plots (12 × 12 numeric variables); images not included]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
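
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration of the formula, not the code the report software runs; the function name `pearson_r` and the sample lists are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: y is a linear function of x, z decreases linearly with x.
x = [1, 2, 3]
y = [2, 4, 6]
z = [10, 8, 6]
print(pearson_r(x, y), pearson_r(x, z))
```

Because r is invariant to location and scale, any increasing linear relationship gives the same value regardless of its slope, and any decreasing one gives the negated value.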

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
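
A minimal sketch of this rank-based calculation follows. It assumes that tied values receive the average of their ranks, a common convention that the text above does not state. The helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical data: y = x**3 is nonlinear but strictly increasing in x.
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```

The example relationship is monotonic but not linear, which is the case where ρ and Pearson's r can differ.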

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
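
The pair-counting definition above can be sketched directly. This version follows the formula as stated (often called tau-a) and counts tied pairs as neither concordant nor discordant. Library implementations may apply a tie correction, so their results can differ on data with ties. The function name `kendall_tau` is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical data: two concordant pairs and one discordant pair out of three.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

The double loop over pairs is quadratic in the number of observations, which is acceptable for illustration but slow for large columns.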

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
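
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table, using the commonly cited form of Bergsma's correction. Whether the report software uses exactly this form is an assumption. The function name `cramers_v_corrected` and the example tables are hypothetical.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic; cells with zero expected count are skipped.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return sqrt(phi2_corr / denom)


# Hypothetical 2 x 2 tables: independent, partially associated, perfectly associated.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[30, 10], [10, 30]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```

The `max(0.0, ...)` clamp keeps the corrected statistic non-negative when the uncorrected value is smaller than the bias term.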

Missing values

[2 missing-value plots; images not included]

Sample

First rows

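(row) | df_index | fixed_acidity | volatile_acidity | citric_acid | residual_sugar | chlorides | free_sulfur_dioxide | total_sulfur_dioxide | density | pH | sulphates | alcohol | quality_level | wine_type
0 | 3372 | 5.6 | 0.15 | 0.26 | 5.55 | 0.051 | 51.0 | 139.0 | 0.99336 | 3.47 | 0.50 | 11.000000 | Good | white
1 | 1094 | 7.4 | 0.24 | 0.31 | 8.40 | 0.045 | 52.0 | 183.0 | 0.99630 | 3.09 | 0.32 | 8.800000 | Good | white
2 | 1514 | 6.9 | 0.84 | 0.21 | 4.10 | 0.074 | 16.0 | 65.0 | 0.99842 | 3.53 | 0.72 | 9.233333 | Good | red
3 | 853 | 7.7 | 0.34 | 0.58 | 11.10 | 0.039 | 41.0 | 151.0 | 0.99780 | 3.06 | 0.49 | 8.600000 | Good | white
4 | 3784 | 6.5 | 0.25 | 0.27 | 17.40 | 0.064 | 29.0 | 140.0 | 0.99776 | 3.20 | 0.49 | 10.100000 | Good | white
5 | 3608 | 7.0 | 0.16 | 0.25 | 14.30 | 0.044 | 27.0 | 149.0 | 0.99800 | 2.91 | 0.46 | 9.200000 | Good | white
6 | 22 | 7.9 | 0.43 | 0.21 | 1.60 | 0.106 | 10.0 | 37.0 | 0.99660 | 3.17 | 0.91 | 9.500000 | Good | red
7 | 118 | 7.2 | 0.31 | 0.50 | 13.30 | 0.056 | 68.0 | 195.0 | 0.99820 | 3.01 | 0.47 | 9.200000 | Good | white
8 | 1001 | 9.9 | 0.35 | 0.38 | 1.50 | 0.058 | 31.0 | 47.0 | 0.99676 | 3.26 | 0.82 | 10.600000 | Excellent | red
9 | 3461 | 6.7 | 0.24 | 0.30 | 3.85 | 0.042 | 105.0 | 179.0 | 0.99189 | 3.04 | 0.59 | 11.300000 | Excellent | white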

Last rows

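(row) | df_index | fixed_acidity | volatile_acidity | citric_acid | residual_sugar | chlorides | free_sulfur_dioxide | total_sulfur_dioxide | density | pH | sulphates | alcohol | quality_level | wine_type
4862 | 959 | 8.0 | 0.590 | 0.05 | 2.0 | 0.089 | 12.0 | 32.0 | 0.99735 | 3.36 | 0.61 | 10.0 | Good | red
4863 | 1593 | 8.6 | 0.160 | 0.49 | 7.3 | 0.043 | 9.0 | 63.0 | 0.99530 | 3.13 | 0.59 | 10.5 | Good | white
4864 | 345 | 7.0 | 0.685 | 0.00 | 1.9 | 0.067 | 40.0 | 63.0 | 0.99790 | 3.60 | 0.81 | 9.9 | Good | red
4865 | 1352 | 7.6 | 0.645 | 0.03 | 1.9 | 0.086 | 14.0 | 57.0 | 0.99690 | 3.37 | 0.46 | 10.3 | Good | red
4866 | 964 | 8.5 | 0.470 | 0.27 | 1.9 | 0.058 | 18.0 | 38.0 | 0.99518 | 3.16 | 0.85 | 11.1 | Good | red
4867 | 320 | 9.8 | 0.660 | 0.39 | 3.2 | 0.083 | 21.0 | 59.0 | 0.99890 | 3.37 | 0.71 | 11.5 | Excellent | red
4868 | 4060 | 6.4 | 0.410 | 0.01 | 6.1 | 0.048 | 20.0 | 70.0 | 0.99362 | 3.19 | 0.42 | 10.0 | Good | white
4869 | 1346 | 7.0 | 0.460 | 0.39 | 6.2 | 0.039 | 46.0 | 163.0 | 0.99280 | 3.21 | 0.35 | 12.2 | Excellent | white
4870 | 3454 | 5.8 | 0.540 | 0.00 | 1.4 | 0.033 | 40.0 | 107.0 | 0.98918 | 3.26 | 0.35 | 12.4 | Good | white
4871 | 3582 | 6.3 | 0.320 | 0.32 | 1.5 | 0.037 | 12.0 | 76.0 | 0.98993 | 3.30 | 0.46 | 12.3 | Good | white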